We study the structure of the social graph of mobile phone users in Mexico, with a focus on the users' demographic attributes (more specifically, their age). We examine assortativity patterns in the graph and observe strong age homophily in communication preferences. We propose a graph-based algorithm for predicting the age of mobile phone users. The algorithm exploits the topology of the mobile phone network, together with a subset of users whose ages are known (seeds), to infer the age of the remaining users. We provide the details of the methodology and show experimental results on a network GT with more than 70 million users. By carefully examining the topological relations of the seeds to the rest of the nodes in GT, we find topological metrics that have a direct influence on the performance of the algorithm. In particular, we characterize subsets of users for which the accuracy of the algorithm is 62% when predicting among 4 age categories (whereas a purely random guess would yield an accuracy of 25%). We also show that the probabilistic information computed by the algorithm can be used to further increase its inference power to 72% on a significant subset of users.
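The idea of inferring ages from seeds through the network topology can be illustrated with a minimal sketch. This is a generic label-propagation scheme, not necessarily the algorithm used in this work: the function names, the neighbor-averaging update rule, the iteration count, and the confidence threshold are all illustrative assumptions. Each user holds a probability vector over the 4 age categories, seeds stay fixed, and every other user repeatedly takes the average of its neighbors' vectors.

```python
from collections import defaultdict


def propagate_age_labels(edges, seeds, num_categories=4, iterations=10):
    """Generic label propagation (an assumed scheme, not the paper's exact method).

    edges: iterable of (user_a, user_b) communication links, treated as undirected.
    seeds: dict mapping a user to a known age category in [0, num_categories).
    Returns a dict mapping each user in the graph to a probability vector.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Seeds start (and stay) certain; all other users start uniform.
    probs = {}
    for user in neighbors:
        if user in seeds:
            vector = [0.0] * num_categories
            vector[seeds[user]] = 1.0
        else:
            vector = [1.0 / num_categories] * num_categories
        probs[user] = vector

    for _ in range(iterations):
        updated = {}
        for user, nbrs in neighbors.items():
            if user in seeds:
                updated[user] = probs[user]  # known ages are never overwritten
                continue
            # Sum the neighbors' vectors, then normalize back to a distribution.
            total = [0.0] * num_categories
            for nbr in nbrs:
                for c in range(num_categories):
                    total[c] += probs[nbr][c]
            norm = sum(total)
            updated[user] = [t / norm for t in total]
        probs = updated
    return probs


def predict(probs, min_confidence=0.0):
    """Pick the most likely category per user, keeping only confident predictions."""
    predictions = {}
    for user, vector in probs.items():
        best = max(range(len(vector)), key=lambda c: vector[c])
        if vector[best] >= min_confidence:
            predictions[user] = (best, vector[best])
    return predictions


# Toy example with hypothetical users: "a" and "e" are seeds.
toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
toy_seeds = {"a": 0, "e": 3}
toy_probs = propagate_age_labels(toy_edges, toy_seeds)
toy_predictions = predict(toy_probs, min_confidence=0.5)
```

Raising `min_confidence` restricts the output to users whose top category has a high probability. In this sketch, that is how probabilistic output can trade coverage for accuracy on a subset of users.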